Abstract

In this paper, we analyze spectrum occupancy using different machine learning techniques. Both
supervised techniques (naive Bayesian classifier (NBC), decision trees (DT), support vector machine
(SVM), linear regression (LR)) and an unsupervised algorithm (hidden Markov model (HMM)) are studied
to find the technique with the highest classification accuracy (CA). A detailed comparison of the
supervised and unsupervised algorithms in terms of computational time and classification accuracy
is performed. The classified occupancy status is further utilized to evaluate the probability of secondary
user outage for future time slots, which can be used by system designers to define spectrum allocation
and spectrum sharing policies. Numerical results show that SVM is the best algorithm among all the
supervised and unsupervised classifiers. Based on this, we propose a new SVM algorithm by combining
it with the fire fly algorithm (FFA), which is shown to outperform all other algorithms.

Index Terms 

Fire fly algorithm, hidden Markov model, spectrum occupancy and support vector machine.
I. Introduction

A cognitive radio network (CRN) is composed of two types of users, namely, the licensed
primary users (PUs) and the unlicensed secondary users (SUs). The core idea behind CR is
to allow unlicensed users to access the licensed bands in an opportunistic manner while avoiding
interference with the licensed users. To achieve this, a realistic understanding of the dynamic
usage of the spectrum is required, and spectrum measurement is an important step towards
this understanding. Various spectrum measurement
campaigns covering a wide range of frequencies have been performed [1]. These spectrum
measurement studies have found a significant amount of unused frequency bands under
normal usage due to static spectrum regulations. This has led researchers to study
the spectrum occupancy characteristics in depth in order to exploit the free spectrum.

A. Problem definition 

Many studies have been performed to understand the occupancy statistics. For instance, a
statistical and spectral occupation analysis of the measurements was presented in [2] in order to
study the traffic density in all frequency bands. In [3], an autoregressive model was used to predict
the radio resource availability using occupancy measurements in order to achieve uninterrupted
data transmission of secondary users. In [4], the occupancy statistics were utilized to select
the best channels for control and data transmission purposes, so that less time is required for
switching transmission from one channel to another when the PU appears. Further,
in [5], [6], the bandwidth efficiency was maximized by controlling the transmission power of
the cognitive radio using spectrum occupancy measurements.

In [7], different time series models were used to categorize specific occupancy patterns in the
spectrum measurements. All of the aforementioned works have evaluated the spectrum occupancy
models by using conventional probabilistic or statistical tools. These tools are often limited by
the assumptions required to derive their theories. For example, one has to determine whether the
value is a random variable or a random process in order to use either probabilistic or statistical
tools. On the other hand, machine learning (ML) is a very powerful tool that has received
increasing attention recently [8]. Machine learning algorithms are often heuristic, as they
do not have any prerequisites or assumptions on the data. As a result, in many cases, they provide
higher accuracy than conventional probabilistic and statistical tools. There are very few works
on the use of ML in spectrum occupancy. For example, the ML works related to CR in [9]-[13]
discussed cooperative spectrum sensing and spectrum occupancy variation. In this
paper, in contrast, we aim to provide a comprehensive investigation of the use of ML for analyzing spectrum
occupancy. The motivation is that different ML algorithms are often suitable for different types
of data. Thus, one needs to try several ML algorithms, not just one, in order to find the one that suits the
spectrum data best.

B. Contributions 

The contributions are listed as follows: 

1. We propose the use of ML algorithms in spectrum occupancy study. Both supervised and
unsupervised algorithms are used. Machine learning techniques are advantageous because
they are capable of implicitly learning the surrounding environment and are much more adaptive
than the traditional spectrum occupancy models. They can describe more optimized
decision regions on the feature space than other approaches. In [9] and [10], ML was used for
cooperative spectrum sensing. In contrast, we use ML for spectrum occupancy modelling that may
be used in all CR operations, including spectrum management, spectrum decision and spectrum
sensing. In [11], the authors discussed call-based modelling for analyzing the spectrum usage
of a dataset collected from a cellular network operator. Further, they showed that a random
walk process can be used for modeling aggregate cell capacity. In contrast, we use ML to model
spectrum occupancy in time slots for all important bands.

2. We have utilized four supervised algorithms, naive Bayesian classifier (NBC), decision trees
(DT), support vector machine (SVM) and linear regression (LR), and one unsupervised algorithm,
hidden Markov model (HMM), to classify the occupancy status of time slots. The classified
occupancy status is further utilized for evaluating the probability of SU outage. In [12], HMM
was used to predict the channel status. Our supervised algorithms and modified HMM all perform
better than HMM. In [13], LR was used to investigate the spectrum occupancy variation in time
and frequency. Our approach outperforms LR as well.

3. We propose a new technique that combines SVM with fire fly algorithm (FFA) that 
outperforms all supervised and unsupervised algorithms. 

The rest of the paper is organized as follows: Section II explains the system model, followed 
by the detailed explanation of classifiers in Section III. The numerical results and discussion are 
presented in Section IV. 


II. System Model 

A. Measurement setup and data 

We have measured the data from 880 MHz to 2500 MHz, containing eight main radio frequency
bands, for approximately four months (6th Feb-18th June 2013) at the University of Warwick
using a radiometer. The eight bands are: 880-915 MHz, 925-960 MHz, 1900-1920 MHz, 1920-1980
MHz, 1710-1785 MHz, 1805-1880 MHz, 2110-2170 MHz and 2400-2500 MHz. The number
of frequency bins in each band varies. For example, the band 925-960 MHz contains 192
frequency bins, each occupying a bandwidth of 0.18 MHz, while the band 1710-1785 MHz
contains 448 frequency bins, each occupying a bandwidth of 0.167 MHz. The data is arranged
in a two dimensional matrix $(t_i, f_j)$ for each band, where each row $t_i$ represents the measured
data at different frequencies in one minute while each column $f_j$ represents the data at different
time instants of each frequency bin. As we have measured the data for four months, which
constitute 131 days (188917 minutes), the number of rows is 188917 while the number of
columns varies according to the number of frequency bins in a particular band.
B. SU Model

In a network of licensed users, the SU is allowed to access the licensed band without causing
any harmful interference to the PU. Let $i$ denote the time slot and $j$ denote the frequency bin,
where $i = 1, 2, \dots, n$, $j = 1, 2, \dots, k$, $n$ represents the total number of time slots and $k$ represents
the total number of frequency bins. Using energy detection [14], let $y^i(j)$ be the sample sensed
at the $i$th time slot in the $j$th frequency bin. One has

$$y^i(j) = x^i(j) + w^i(j) \tag{1a}$$

or

$$y^i(j) = w^i(j) \tag{1b}$$

where $x^i(j)$ represents the received PU signal and $w^i(j)$ represents zero-mean additive white Gaussian
noise (AWGN). Each sample is compared with a threshold ($\gamma$).
The selection of $\gamma$ is very important because small values of $\gamma$ will cause false alarms while large
values will cause PU transmissions to be missed. The computation of $\gamma$ was explained in [15]. In our
approach, the threshold is dynamic and its selection is explained in Section IV-B. The spectrum
status is given as

$$S^i(j) = \begin{cases} 1, & y^i(j) > \gamma \\ 0, & y^i(j) < \gamma. \end{cases}$$

The occupancy for the $i$th time slot for all $k$ frequency bins is defined as

$$OC^i = \frac{\sum_{j=1}^{k} S^i(j)}{k} \tag{2}$$

For example, a three-minute interval for the band 880-890 MHz having 9 frequency bins
is shown in Fig. 1, where each bin occupies 1 MHz. For each frequency bin, $S^i(j)$ is decided.
Once $S^i(j)$ is evaluated, the occupancy $OC^i$ is calculated using (2). It is observed that more
frequency bins are occupied in the first minute than in the second and third minutes, so the
SU has less chance to transmit in the first minute. Following the discussion above, we need to set the criteria
for quantifying this chance based on the occupancies.
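
The thresholding and occupancy steps described above can be sketched in a few lines of Python. This is only an illustrative sketch: the power readings are made-up values, the function names `spectrum_status` and `occupancy` are hypothetical, and the threshold of -102 dBm is simply one of the values considered later in the paper.

```python
def spectrum_status(power_dbm, threshold_dbm):
    """Binary status per sample: 1 if the sensed power exceeds the threshold, else 0."""
    return [[1 if y > threshold_dbm else 0 for y in row] for row in power_dbm]


def occupancy(status):
    """Occupancy per time slot: fraction of occupied frequency bins in each row."""
    return [sum(row) / len(row) for row in status]


# Hypothetical power readings (dBm): 3 one-minute slots x 4 frequency bins.
power = [
    [-95.0, -101.0, -99.0, -107.0],
    [-108.0, -109.0, -100.0, -110.0],
    [-109.0, -108.0, -107.0, -110.0],
]
status = spectrum_status(power, threshold_dbm=-102.0)
oc = occupancy(status)
print(oc)
```

Each row of `status` plays the role of the status vector of one time slot, and each entry of `oc` is the corresponding occupancy from Eq. (2).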
Fig. 1. Occupancy for different time slots in the band. (Figure labels: frequency axis starting at 880 MHz; slot occupancies of 63 %, 18 % and 27 %.)

C. PU Model 

As per our approach, the status of the PU ($P^i$) for each $i$th time slot can be decided using the
following rules:

$$P^i = \left\{ \begin{array}{lll}
1, & OC^i > U_{oc} & \text{(Condition 1)} \\
1, & L_{oc} \le OC^i \le U_{oc} \text{ and } con^i < B & \text{(Condition 2)} \\
0, & L_{oc} \le OC^i \le U_{oc} \text{ and } con^i \ge B & \text{(Condition 3)} \\
0, & OC^i < L_{oc} & \text{(Condition 4)}
\end{array} \right.$$

where $U_{oc}$ and $L_{oc}$ represent the maximum and minimum values of occupancy for all $n$ time
slots, $con^i$ represents the number of consecutive free frequency bins in the $i$th time slot and
$B$ represents the limit on $con^i$ below which the PU is considered present. Each condition is
explained as follows:

1. Condition 1 and Condition 4: The values of $U_{oc}$ and $L_{oc}$ vary with the frequency band, the
day and the threshold. Our tests show that $U_{oc}$ should not be less than 75% and $L_{oc}$ should not be
greater than 40%. For a fixed frequency band and day, we have evaluated $U_{oc}$ and $L_{oc}$ for different
thresholds in Section IV-B. In order to guarantee PU protection and ensure SU transmission
when the values of $OC^i$ lie in the range between $L_{oc}$ and $U_{oc}$, a further criterion is applied.

2. Condition 2 and Condition 3: When $L_{oc} \le OC^i \le U_{oc}$, it is difficult to apply condition
1 and condition 4, so we evaluate $con^i$ for each time slot. If $con^i \ge B$ for $L_{oc} \le OC^i \le U_{oc}$,
there exist at least $B$ consecutive free frequency bins in the $i$th time slot and the SU can transmit;
conversely, the PU is considered present when $con^i < B$. The value of $B$ is selected to provide PU protection. This will
be explained in Section IV-B.
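
The four conditions can be sketched as a small decision function. This is a sketch under stated assumptions: the number of consecutive free bins is taken to be the longest run of zeros in the status vector, the function names are hypothetical, and the values of the lower limit, upper limit and B used in the example are chosen only for illustration.

```python
def longest_free_run(status_row):
    """Length of the longest run of consecutive free (0) bins; an assumed reading of con."""
    best = run = 0
    for s in status_row:
        run = run + 1 if s == 0 else 0
        best = max(best, run)
    return best


def pu_status(status_row, l_oc, u_oc, b):
    """Return 1 (PU present) or 0 (PU absent) following Conditions 1 to 4."""
    oc = sum(status_row) / len(status_row)
    if oc > u_oc:          # Condition 1: band clearly occupied
        return 1
    if oc < l_oc:          # Condition 4: band clearly free
        return 0
    con = longest_free_run(status_row)
    return 1 if con < b else 0  # Conditions 2 and 3


# Illustrative limits only (not taken from a specific day in the paper).
L_OC, U_OC, B = 0.2, 0.75, 3
print(pu_status([1, 1, 1, 1, 1, 1, 1, 1, 0, 1], L_OC, U_OC, B))  # high occupancy
print(pu_status([1, 0, 1, 0, 1, 0, 1, 0, 1, 0], L_OC, U_OC, B))  # fragmented free bins
print(pu_status([1, 1, 1, 0, 0, 0, 0, 1, 1, 0], L_OC, U_OC, B))  # one long free block
```

The second and third rows have the same occupancy of 0.5, so only the run-length test separates them.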

D. Machine Learning Framework for SU and PU Model 

ML constructs a classifier to map $S^i$ to $P^i$, where $S^i = [S^i(1), S^i(2), \dots, S^i(k)]$ represents the
feature vector and $P^i$ is the corresponding response to the feature vector. There are two steps
for constructing a classifier:

1) Training: Let $S^i_{train} = [S^i(1)_{train}, S^i(2)_{train}, \dots, S^i(k)_{train}]^T$ denote the training spectrum
status and $P^i_{train}$ represent the training PU status for the $i$th time slot, where $i =
1, 2, \dots, n_1$ and $n_1$ represents the number of training time slots fed into the classifier.

2) Testing: Once the classifier is successfully trained, it is ready to receive the test vector for
classification. Let $S^i_{test} = [S^i(1)_{test}, S^i(2)_{test}, \dots, S^i(k)_{test}]^T$ denote the testing spectrum status and
$P^i_{test}$ represent the testing PU status for the $i$th time slot, where $i = n_1+1, n_1+2, \dots, n_2$
and $n_2$ represents the length of the testing sequence. It is assumed that $n = n_1 + n_2$. For our proposed
approach, the matrix of size $n \times k$ is divided into a 15% training data matrix of size $n_1 \times k$ and
an 85% testing data matrix of size $n_2 \times k$. The value $P^i_{test}$ is not used during testing; it serves only as a
reference for computing the classification error.

3) Classification Accuracy (CA): Let $P^i_{eval}$ denote the PU status determined by the classifier
for the $i$th time slot. The classifier categorizes the testing vector $S^i_{test}$ as the 'occupied class' (i.e.,
$P^i_{eval} = 1$) or the 'unoccupied class' (i.e., $P^i_{eval} = 0$). The PU status is correctly determined
when $P^i_{eval} = P^i_{test}$, giving $CA^i = 1$. A misdetection occurs when $P^i_{eval} = 0$ and $P^i_{test} = 1$,
while a false alarm occurs when $P^i_{eval} = 1$ and $P^i_{test} = 0$; both give $CA^i = 0$.
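
As a sketch of how the evaluated and reference PU status vectors might be compared, the following Python function counts correct decisions, misdetections and false alarms. The function name `evaluate` and the two short label vectors are hypothetical and serve only to illustrate the definitions above.

```python
def evaluate(p_eval, p_test):
    """Compare classifier output with the reference PU status, slot by slot."""
    pairs = list(zip(p_eval, p_test))
    correct = sum(1 for e, t in pairs if e == t)
    misdetections = sum(1 for e, t in pairs if e == 0 and t == 1)  # PU present but missed
    false_alarms = sum(1 for e, t in pairs if e == 1 and t == 0)   # PU absent but flagged
    return {
        "ca": correct / len(pairs),
        "misdetections": misdetections,
        "false_alarms": false_alarms,
    }


# Hypothetical reference and evaluated PU status for eight test slots.
p_test = [1, 0, 1, 1, 0, 0, 1, 0]
p_eval = [1, 0, 0, 1, 1, 0, 1, 0]
result = evaluate(p_eval, p_test)
print(result)
```

Reporting the two error types separately is useful because a misdetection harms the PU while a false alarm only costs the SU a transmission opportunity.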

E. Probability of SU outage 

Let $P_{eval}$ be a vector of length $((n_2 - n_1) + 1)$ evaluated by each classifier, and $P^i_{eval}$ represent
the presence/absence of the PU for the $i$th time slot. When $P^i_{eval} = 0$, the SU is allowed to utilize the
$i$th time slot. Define $out_{su}$ as the minimum number of consecutive free time slots required by the SU
for transmission. SU outage occurs when the SU cannot find $out_{su}$ consecutive free time slots in the
vector $P_{eval}$ of length $((n_2 - n_1) + 1)$. The probability of SU outage is given by

$$P(SU_{outage}) = 1 - P(SU_{transmit}) \tag{3a}$$

where

$$P(SU_{transmit}) = \sum_{c=1}^{C} P(FB_c) \tag{3b}$$

where $FB_c$ represents a block of free consecutive time slots of length $out_{su}$, $c = \{1, 2, \dots, C\}$
and $C$ represents the total number of free blocks present in $P_{eval}$. The probability for a free
block starting at index, say, $r$ in $P_{eval}$ is evaluated using the following equation

$$P(FB_c) = \prod_{i=r}^{r+out_{su}} OC^i \tag{3c}$$
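
The block search that underlies the outage definition can be sketched as follows. This sketch only locates the free blocks and checks whether any exist; it does not evaluate the block probabilities. It assumes non-overlapping blocks, which the text does not specify, and the function names and the example vector are hypothetical.

```python
def free_block_starts(p_eval, out_su):
    """Start indices of non-overlapping runs of out_su consecutive free (0) slots."""
    starts = []
    run = 0
    for i, p in enumerate(p_eval):
        run = run + 1 if p == 0 else 0
        if run == out_su:
            starts.append(i - out_su + 1)
            run = 0  # assumption: blocks do not overlap
    return starts


def su_in_outage(p_eval, out_su):
    """The SU is in outage when no block of out_su consecutive free slots exists."""
    return len(free_block_starts(p_eval, out_su)) == 0


# Hypothetical evaluated PU status for twelve time slots.
p_eval = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1]
print(free_block_starts(p_eval, out_su=3))
print(su_in_outage(p_eval, out_su=7))
```

A classifier that never reports a sufficiently long free run, as described later for HMM, would make `su_in_outage` return `True` for every test vector.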

III. Proposed Algorithms 


In the proposed approach, five machine learning algorithms are utilized to predict the future 
PU status using the occupancy data, which is a function of time, frequency and threshold. 
Among them, four are supervised learning algorithms: NBC, DT, SVM and LR, while one is 
an unsupervised algorithm, HMM. The motivation to use five different algorithms is to find the 
best machine learning algorithm as they have different characteristics. 
A. Naive Bayesian Classifier

A naive Bayesian classifier is a generative model based on Bayes' theorem. It is also called an
'independent feature model' because it does not take the dependency of features into account. The
feature vector for the $i$th time slot in our model contains samples which are independent of
each other, since every feature represents a specific frequency bin. For example, the status vector
of the $i$th time slot is given as $S^i = [S^i(1), S^i(2), \dots, S^i(k)]$, where $S^i(1)$ is independent
of $S^i(2)$. However, the response variable in our approach, i.e. the PU status ($P^i$), is a dependent
variable which is affected by each frequency bin. As our features are independent, we
use NBC for classification. The probability of $S^i$ belonging to the class $P^i$, evaluated using
Bayes' theorem, is formally defined as [16]

$$p(P^i \mid S^i) = p(P^i)\, p(S^i \mid P^i). \tag{4}$$

When $P^i = 0$, $S^i$ will be classified as the 'idle' class, while when $P^i = 1$, $S^i$ will be classified
as the 'occupied' class. The goal is to find the class with the largest posterior probability in the
classification phase. The classification rule is given as

$$\mathrm{classify}(S^i) = \arg\max_{P^i} \{ p(P^i \mid S^i) \} \tag{5}$$

where $S^i = \{S^i(1), S^i(2), \dots, S^i(k)\}$. NBC is sensitive to the choice of kernel and the prior
probability distribution of classes. This will be explained in Section IV-B.
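
To make the classification rule concrete, here is a minimal naive Bayes sketch for binary status vectors. It is not the configuration used in the paper: it assumes Bernoulli likelihoods with add-one smoothing rather than the normal kernel mentioned later, the function names are hypothetical, and the training data are made up. It also assumes both classes appear in the training labels.

```python
import math


def train_nbc(features, labels):
    """Estimate class priors and per-bin Bernoulli likelihoods (add-one smoothing)."""
    k = len(features[0])
    model = {}
    for c in (0, 1):
        rows = [f for f, y in zip(features, labels) if y == c]
        prior = len(rows) / len(labels)
        # Smoothed estimate of P(bin j occupied | class c)
        p_one = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2) for j in range(k)]
        model[c] = (prior, p_one)
    return model


def classify(model, s):
    """Pick the class with the largest (log) posterior, treating bins as independent."""
    def log_posterior(c):
        prior, p_one = model[c]
        return math.log(prior) + sum(
            math.log(p if x == 1 else 1.0 - p) for x, p in zip(s, p_one)
        )
    return max((0, 1), key=log_posterior)


# Hypothetical training data: 6 time slots x 4 frequency bins.
train_s = [[1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 1],
           [0, 0, 0, 1], [0, 1, 0, 0], [0, 0, 0, 0]]
train_p = [1, 1, 1, 0, 0, 0]
model = train_nbc(train_s, train_p)
print(classify(model, [1, 1, 1, 1]), classify(model, [0, 0, 0, 0]))
```

Working with log-probabilities avoids numerical underflow when the number of frequency bins is large, for example k = 192.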

B. Decision Trees 

A decision tree builds classification or regression models in the form of a tree structure. The
decision trees used in this approach are classification trees whose leaves represent the class labels.
Unlike NBC, DT can handle feature interactions and dependencies. In DT, a decision is made
at each internal node, which is used as a basis for dividing the data into two subsets, while leaf
nodes represent the class labels (in the case of classification trees) or real numbers (in the
case of regression trees). Data come in the form

$$(S^i, P^i) = (S^i(1), S^i(2), \dots, S^i(k), P^i) \tag{6}$$

where $P^i$ is the dependent variable representing the class label of the $i$th time slot. The class labels
$P^i$ are assigned by calculating the entropy of the feature, as [17]

$$\mathrm{Entropy}(t) = -\sum_{id=0}^{Z} p(id \mid t) \log_2 p(id \mid t) \tag{7}$$

where $p(id \mid t)$ denotes the fraction of records belonging to class $id$ at a given node $t$ and $Z$
represents the total number of classes. In our approach, $Z = 1$. A smaller entropy indicates a purer
node, and zero entropy implies that all records belong to the same class. Section IV-C discusses how the fraction of
records per node affects the classification accuracy of DT.
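
The entropy of a node can be computed directly from the class labels of the records that reach it. The short sketch below illustrates this for the two-class case; the function name `node_entropy` and the label lists are hypothetical.

```python
import math


def node_entropy(labels):
    """Entropy (in bits) of the class labels of the records at one tree node."""
    n = len(labels)
    entropy = 0.0
    for c in set(labels):
        p = labels.count(c) / n  # fraction of records in class c at this node
        entropy -= p * math.log2(p)
    return entropy


print(node_entropy([1, 1, 1, 1]))  # pure node
print(node_entropy([0, 0, 1, 1]))  # evenly mixed node
print(node_entropy([0, 1, 1, 1]))  # partly mixed node
```

Classes that do not appear at the node are skipped, which matches the usual convention that a zero-probability class contributes nothing to the sum.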

C. Support Vector Machines 

SVM is a discriminative classifier with high accuracy. Unlike DT, it is less prone to over-fitting and
can be used for online learning [18]. There are two types of classifiers in SVM: linear SVM for
separable data and non-linear SVM for non-separable data. The linear classifier is used here. The
training feature and response vectors can be represented as $D = (P^i, S^i)$ where $P^i \in \{0, 1\}$. The
two classes are separated by defining a division line $H$ represented as $d \cdot S^i + b = p$, where
$d$ and $b$ represent the weighting vector and bias, respectively, while $p$ represents the constant
for dividing the two hyper planes. The maximum-margin hyper planes that divide the points having
$P^i = 1$ from those having $P^i = 0$ are given as:

$$P^i = +1 \text{ when } d \cdot S^i + b > p \quad \text{(Occupied Class)} \tag{8a}$$

$$P^i = 0 \text{ when } d \cdot S^i + b < p \quad \text{(Idle Class)} \tag{8b}$$

The separation between the two hyper planes is the margin, controlled by a parameter called the box
constraint $Box_{ct}$. We have evaluated the optimal value of $Box_{ct}$ using a bio-inspired technique,
i.e. FFA, in our approach.
D. SVM with Fire Fly Algorithm

In FFA, let $X$ be a group of fire flies, $X = [l_1, l_2, \dots, l_X]$, initially located at specific positions
$a_X = [a_{l_1}, a_{l_2}, \dots, a_{l_X}]$. Each fire fly moves and tries to find a brighter fire fly, which has more light
intensity than its own. The objective function $f(x)$ used for evaluating the brightness of a fire
fly in our approach is the classification accuracy, i.e. $f(x) = CA(a_X)$. When a fire fly, say $l_1$,
finds another brighter fire fly $l_2$ at another location,
it tends to move towards fire fly $l_2$. The change in position is determined as [20]

$$a_{l_1}^{v+1} = a_{l_1}^{v} + \beta_0 e^{-\psi_{l_1 l_2} r_{l_1 l_2}^{2}} (a_{l_2}^{v} - a_{l_1}^{v}) + \alpha (rand - 0.5) \tag{9}$$

where $v$ represents the iteration number, $a_{l_1}$ and $a_{l_2}$ represent the positions of fire flies $l_1$ and
$l_2$ respectively, $r_{l_1 l_2}$ is the distance between them, $\alpha$, $\beta_0$ and $\psi_{l_1 l_2}$ are constants and $rand$ is a uniformly distributed random number.
For our approach, the starting positions of the $X$ fire flies are initialized, while the position of
each fire fly represents a value of the box constraint $Box_{ct}$.
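
The position update can be illustrated with a one-dimensional search. This is a sketch under several assumptions: the attraction term uses the standard firefly form with the squared distance between two fire flies in the exponent, the brightness is a made-up stand-in objective instead of the classification accuracy of a trained SVM, and the function name `firefly_search` is hypothetical. The constants 1, 2 and 1.3 are the FFA settings reported in Section IV-C.

```python
import math
import random


def firefly_search(objective, positions, alpha=1.0, beta0=2.0, psi=1.3,
                   iterations=30, seed=0):
    """Move each fire fly towards brighter ones; return the brightest final position."""
    rng = random.Random(seed)
    pos = list(positions)
    for _ in range(iterations):
        brightness = [objective(a) for a in pos]
        for i in range(len(pos)):
            for j in range(len(pos)):
                if brightness[j] > brightness[i]:
                    r = abs(pos[i] - pos[j])  # distance between the two fire flies
                    attraction = beta0 * math.exp(-psi * r * r)
                    pos[i] += attraction * (pos[j] - pos[i]) + alpha * (rng.random() - 0.5)
                    brightness[i] = objective(pos[i])
    return max(pos, key=objective)


def toy_brightness(a):
    """Stand-in objective peaking at a = 3 (hypothetical, replaces CA as a function of the box constraint)."""
    return -(a - 3.0) ** 2


start = [0.5, 5.0, 9.0]
best = firefly_search(toy_brightness, start)
print(best, toy_brightness(best))
```

Because the currently brightest fire fly never moves, the best brightness in the swarm cannot decrease from one update to the next. In the setting of the paper, evaluating the objective would mean training and testing an SVM for each candidate box constraint, which explains the longer computation time reported for SVM+FFA.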

E. Linear Regression 

The flexibility of linear regression to include a mixture of various features in different dimensions,
e.g. space, frequency, time and threshold, as a linear combination is the main motivation for using
it for modeling in this approach. The linear regression model for our approach is given by:

$$P^i = e_0 + e_1 S^i(1) + e_2 S^i(2) + \dots + e_k S^i(k) = e_0 + \sum_{j=1}^{k} e_j S^i(j) \tag{10}$$

where the class label $P^i$ is represented as a linear combination of the parameters $e_1, e_2, \dots, e_k$ and
features $(S^i(1), S^i(2), \dots, S^i(k))$ in the $i$th time slot. Stepwise linear regression is used in this
approach. In each step, the optimal term based on the value of a defined 'criterion' is selected. The
'criterion' can be set as the sum of squares error (SSE), deviance, Akaike information criterion
(AIC), Bayesian information criterion (BIC), R-squared, etc. SSE is used in this approach;
small values of SSE indicate a good model. It is observed from (10) that the
computational time for evaluating the response of the model increases linearly with the number
of frequency bins/predictors involved, so we need to select an appropriate number of predictors
for linear regression.

F. Hidden Markov Models

HMM is an unsupervised algorithm for modeling time series data. The motivation to use an
unsupervised algorithm is that it does not need a training phase. In HMM, the sequence of
states can be recovered by an analysis of the sequence of observations. The sets of states and
observations are represented by $U$ and $G$, given as $U = (u_1, u_2, \dots, u_N)$ and $G = (g_1, g_2, \dots)$, where
$u_1$ and $u_2$ represent the states when $P^i = 0$ and $P^i = 1$, respectively. The observations $g_1$ and
$g_2$ represent the value of $OC$ corresponding to each $P^i$. HMM is defined as

$$\lambda = (C_h, D_h, \pi) \tag{11}$$

where the transition array $C_h$ is the probability of switching from state $u_1$ to state $u_2$, given as
[21] $C_h = [c_{12}] = P(q_t = u_2 \mid q_{t-1} = u_1)$, $D_h$ is the probability of observation $g$ being
produced from a state, $D_h = [d_{1,2}] = P(o_t = g_{1,2} \mid q_t = u_2)$, and $\pi$ is the initial probability array,
$\pi = P(q_1 = u_2)$.

HMM has two main steps. In the first step, the sequence of observations $O = (o_1, o_2, \dots, o_T)$, the
transition probability matrix $C_h$ and the emission probability matrix $D_h$ are utilized to find the
probability of the observations $O$ given the HMM model $\lambda$, given in ([21], Eq. 13) as $P(O \mid \lambda) =
\sum_{Q} P(O \mid Q, \lambda) P(Q \mid \lambda)$, where $Q = (q_1, q_2, \dots, q_T)$ and $P(O \mid Q, \lambda) = \prod_{t=1}^{T} P(o_t \mid q_t, \lambda) = g_{q_1}(o_1)
g_{q_2}(o_2) \cdots g_{q_T}(o_T)$. The probability of the state sequence is given as $P(Q \mid \lambda) = \pi_{q_1} c_{q_1 q_2} c_{q_2 q_3} \cdots c_{q_{T-1} q_T}$.
In the second step, the hidden state sequence that is most likely to have produced the observations
is decoded using the Viterbi algorithm. The most likely sequence of states $Q_l$ generated using the
Viterbi algorithm is matched with the expected fixed state sequence $Q$ to compute the classification
accuracy. HMM can also be supervised by adding two extra steps:
Step (a): Use the initial guesses of $C_h$ and $D_h$ to compute $Q$ and $O$, which are used for computing
$P(O \mid \lambda)$ in the forward algorithm.
Step (b): Use $O$, $D_h$ and $C_h$ from Step (a) to estimate the transition probability matrix $C_h'$ and the
emission probability matrix $D_h'$ using maximum likelihood estimation [22].

$C_h'$ and $D_h'$ collectively form the estimated HMM model ($\lambda_e$) that can be further used for
evaluating $P(O \mid \lambda)$ and $Q_l$ using the forward algorithm and the Viterbi algorithm respectively.
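
The decoding step can be sketched with a small Viterbi implementation for a two-state model. All probabilities below are made-up values chosen for illustration, the observations are assumed to be quantized into two symbols (0 for low occupancy, 1 for high occupancy), and the function name `viterbi` is hypothetical. The sketch also assumes every probability is strictly positive so that the logarithms are defined.

```python
import math


def viterbi(obs, trans, emit, init):
    """Most likely hidden state sequence for a discrete-observation HMM (log domain)."""
    n_states = len(init)
    score = [math.log(init[s]) + math.log(emit[s][obs[0]]) for s in range(n_states)]
    back = []  # back-pointers, one list per time step after the first
    for o in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda q: prev[q] + math.log(trans[q][s]))
            pointers.append(best_prev)
            score.append(prev[best_prev] + math.log(trans[best_prev][s])
                         + math.log(emit[s][o]))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical model: state 0 = PU absent, state 1 = PU present.
trans = [[0.9, 0.1], [0.1, 0.9]]   # transition probabilities
emit = [[0.8, 0.2], [0.2, 0.8]]    # emission probabilities per state
init = [0.5, 0.5]                  # initial state probabilities
decoded = viterbi([0, 0, 0, 1, 1, 1], trans, emit, init)
print(decoded)
```

Comparing `decoded` with a reference state sequence, slot by slot, gives the classification accuracy in the same way as for the supervised classifiers.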

IV. Numerical Results and Discussion 

In order to analyze the occupancy of the eight bands, the statistics of the data in all bands from 880
to 2500 MHz are presented in Section IV-A. The classification criteria are explained in Section
IV-B. The selection of the best parameters for each model using the classification criteria is
discussed in Section IV-C. The classification models with the optimal parameters are compared
to find the best classifier in terms of the CA, defined as

$$CA = \frac{\text{number of correct classifications}}{\text{total number of test samples}}.$$

A. Statistics of Data 

The CDF plot is shown in Fig. 2, which gives a summarized view of all power ranges for
the eight bands. It can be observed from Fig. 2 that the eight bands can be categorized into two
main groups. Group A contains those bands that have wide power ranges from -110 dBm to
-30 dBm, including 1805-1880 MHz, 1710-1785 MHz and 2110-2170 MHz. Group B has five
bands: 925-960 MHz, 880-915 MHz, 2400-2500 MHz, 1920-1980 MHz and 1900-1920 MHz
that have power ranges between -110 dBm and -100 dBm. Thus, Group A bands have a larger
standard deviation than Group B bands. Next we discuss the effects of two main parameters
(frequency and threshold) on occupancy.

1) Occupancy vs Threshold: Threshold selection is an important task for analyzing the
occupancy of each time slot. We took the minimum and the maximum value of power for
each frequency band and tested seven threshold values in this range. Each band is analyzed
separately for the seven values of the threshold using the four months of data. Due to limited space,
only 925-960 MHz is shown in Fig. 3. It is observed that occupancy monotonically decreases when
the value of the threshold increases. These results confirm that a larger threshold
classifies fewer samples as occupied.
2) Occupancy vs Frequency: The relationship between occupancy and frequency is analyzed
by computing the occupancy of the $j$th bin individually. Eq. (2) can be modified for computing
the occupancy of the $j$th frequency bin ($OC^j = \frac{\sum_{i=1}^{n} S^i(j)}{n}$). We have found in Fig. 4 a unique
periodicity in some bands. Four bands can be categorized as the periodic group:
880-915 MHz, 1710-1785 MHz, 1900-1920 MHz and 2400-2500 MHz. The bands
925-960 MHz, 1805-1880 MHz, 1920-1980 MHz and 2110-2170 MHz do not have this property.
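
The per-bin occupancy is the column-wise counterpart of the per-slot occupancy: the status values of one frequency bin are averaged over all time slots. A minimal sketch, with a hypothetical function name and made-up status values, is given below.

```python
def bin_occupancy(status):
    """Occupancy per frequency bin: fraction of time slots in which each bin is occupied."""
    n = len(status)
    k = len(status[0])
    return [sum(row[j] for row in status) / n for j in range(k)]


# Hypothetical status matrix: 4 time slots (rows) x 3 frequency bins (columns).
status = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
]
per_bin = bin_occupancy(status)
print(per_bin)
```

Plotting such a vector against the bin centre frequencies gives the kind of occupancy-versus-frequency curve discussed for Fig. 4.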

The periodicity may be caused by the uplink/downlink usage pattern of each
band. For instance, the bands 1710-1785 MHz and
1900-1920 MHz are uplinks, while the aperiodic bands 1805-1880 MHz and 1920-1980 MHz are
downlinks. The uplink transmits data from the mobile user to the base station, so its activity is
completely determined by the mobile users' periodic usage pattern. On the other hand, the downlink
transmits data from the base station to the mobile user, so its activity is also affected by
control and broadcast channels, making it less periodic or non-periodic.

B. Classification Criteria 

This subsection studies the choice of $U_{oc}$, $L_{oc}$, $con^i$ and $B$ in Section II-C, as shown in Fig. 5.
We have utilized Day 1 (1-1440 min), Day 2 (1441-2448 min) and Day 5 (7200-8640 min)
in Band 880-915 MHz, and four different values of threshold: $\gamma = [-102, -104, -106, -108]$
dBm. The parameters $U_{oc}$ and $L_{oc}$ are selected using $M_s$, which represents the occupancy
split that divides the data into occupied and idle classes. It varies from 0.1 to 0.9 with a step
size of 0.1. It is observed in Fig. 5 that the value of CA depends on the day and the value of the
threshold. The actual value of $OC^i_{train}$ in (2) always lies in a certain range, $[L_s, U_s]$, where $L_s$
represents the lowest value of $OC^i_{train}$ and $U_s$ represents the maximum value of $OC^i_{train}$. When
$L_s \le M_s \le U_s$, the two classes $P^i = 0$ (available class) and $P^i = 1$ (occupied class)
can be classified correctly. When $M_s > U_s$ or $M_s < L_s$, all the samples will be classified as one
class because the values of $OC^i_{train}$ do not lie outside the range $[L_s, U_s]$. This
explains why CA = 1 for $[L_{oc}, U_{oc}] = [0.1, 0.2]$ and $[L_{oc}, U_{oc}] = [0.75, 0.9]$ while CA < 1
for $[L_{oc}, U_{oc}] = [0.2, 0.75]$ for Day 1 using $\gamma = -102$ dBm. Thus, the classification cannot be
performed when $M_s > U_s$ or $M_s < L_s$. The optimal range is $[L_{oc}, U_{oc}] = [0.2, 0.75]$, for which CA < 1.
However, for CA < 1, there are four different choices of threshold available. In our proposed
approach, we choose the threshold that places the largest number of occupancy values
between $L_{oc}$ and $U_{oc}$. Following this, we have selected $\gamma = -102$ dBm for Day 1, Day 2 and Day 5
as the optimal threshold, which ensures the largest number of samples between $L_{oc}$ and $U_{oc}$. The
optimal ranges are $[L_{oc}, U_{oc}] = [0.2, 0.75]$ for Day 1, $[L_{oc}, U_{oc}] = [0.4, 0.85]$ for Day 2 and $[L_{oc}, U_{oc}] = [0.2, 0.80]$
for Day 5. The optimal values of $\gamma$, $U_{oc}$ and $L_{oc}$ are further used for finding $B$ for
each day.
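
The threshold selection rule can be sketched as a simple search over the candidate thresholds. The power readings below are made up, so the selected value differs from the -102 dBm reported for the measured data; the function names are hypothetical, and only the candidate thresholds and the range [0.2, 0.75] come from the text.

```python
def count_in_range(oc_values, l_oc, u_oc):
    """Number of time slots whose occupancy lies between the lower and upper limits."""
    return sum(1 for oc in oc_values if l_oc <= oc <= u_oc)


def select_threshold(power_dbm, thresholds, l_oc, u_oc):
    """Pick the threshold that places the most occupancy values inside [l_oc, u_oc]."""
    def score(th):
        oc = [sum(1 for y in row if y > th) / len(row) for row in power_dbm]
        return count_in_range(oc, l_oc, u_oc)
    return max(thresholds, key=score)


# Hypothetical power readings (dBm): 4 time slots x 4 frequency bins.
power = [
    [-101.0, -103.0, -105.0, -105.0],
    [-100.0, -103.0, -107.0, -109.0],
    [-101.0, -105.0, -107.0, -109.0],
    [-103.0, -105.0, -107.0, -110.0],
]
candidates = [-102.0, -104.0, -106.0, -108.0]
chosen = select_threshold(power, candidates, l_oc=0.2, u_oc=0.75)
print(chosen)
```

If two thresholds tie, `max` returns the first one in the candidate list, so the order of `candidates` acts as a tie-breaker in this sketch.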

C. Model Performance Comparison 

Following the discussion above, we compare the performance of the algorithms in this
section using one month of data from Band 880-915 MHz. Our tests show that the minimum number of
observations per node for DT can be selected as 17, the number of predictors for LR as 15, a normal
kernel for NBC and a linear kernel for SVM. The optimal splitting range, optimal threshold and
$B$ are selected according to the data of each day.

1) Supervised vs Unsupervised Algorithms using k = 55: In Fig. 6(a), it is observed that the
mean CA attained by LR, SVM, DT, NBC and HMM is 0.9257, 0.9162, 0.8483, 0.9493 and
0.4790, respectively. The mean computation time per iteration for LR, SVM, DT, NBC and
HMM is 350.19, 0.092, 0.0136, 0.0045 and 0.0171 seconds, respectively. Thus, NBC is the best
considering both accuracy and complexity.

2) Supervised vs Unsupervised Algorithms using k = 192: We have compared HMM,
trained HMM, SVM, DT and NBC in Fig. 6(b) for 30 days. Each iteration represents 1 day.
LR is not shown as it takes an excessively long time in this case. It is observed that trained
HMM performs better than HMM, but worse than DT, NBC and SVM. The mean CA attained
by trained HMM, HMM, SVM, DT and NBC is 0.6816, 0.4887, 0.8528, 0.8392 and 0.7970, while
the computational time for each iteration of trained HMM, HMM, SVM, DT and NBC is 0.0205,
0.09066, 0.0135, 0.0163 and 0.0095 seconds, respectively. Thus, SVM is the best in this case, with the
highest CA and a low computation time.

3) SVM with Fire Fly Algorithm: So far, the best overall performance is attained by the
linear SVM technique. The performance of linear SVM is affected by the value of $Box_{ct}$, as
discussed in Section III-C. The fire fly algorithm can be used to select the best value of $Box_{ct}$.
We set $\alpha = 1$, $\beta_0 = 2$ and $\psi_{l_1 l_2} = 1.3$ for FFA. Fig. 7(a) shows that 'SVM+FFA' performs
better than the conventional SVM in most of the cases. The mean CA attained by SVM+FFA,
SVM, DT, NBC and HMM is 0.8728, 0.8499, 0.7970, 0.8392 and 0.4822, respectively.

4) Probability of SU Outage: This probability is computed using SVM+FFA, SVM, DT, NBC
and HMM and compared with the expected $P(SU_{outage})$ to compute the difference between the
evaluated and expected values. It is evident in Fig. 7(b) that SVM+FFA predicts
$P(SU_{outage})$ with the minimum difference and is very close to the expected one. The expected SU
outage is 0.9191 in Fig. 7(b), while the predicted $P(SU_{outage})$ using SVM+FFA, SVM, NBC,
DT and HMM is 0.9264, 0.9322, 0.9638, 0.9577 and 1, respectively. The $P(SU_{outage})$ for HMM
is always 1, which implies that HMM has failed to find any block of consecutive free time slots
of length $out_{su}$.

5) Supervised vs Unsupervised Algorithms using different Training/Testing Data vectors:
We present a detailed comparison of supervised and unsupervised algorithms using
different sizes of training and testing data in Table 1. The classification accuracy and computation
time of all supervised algorithms increase with the size of the training data.
SVM+FFA attains the highest CA but with the longest computation time in most cases.
Fig. 2. The CDFs for the eight bands between 880-2500 MHz. (Panel title: 6th Feb to 7th Feb 2013.)

Fig. 3. Occupancy vs threshold for Band 925-960 MHz. (Panel title: Mean Occupancy from 6th Feb-18th June.)
Fig. 4. Occupancy vs spectrum frequency for (a) Band 880-915 MHz and (b) Band 925-960 MHz.
Fig. 5. Selection of optimal threshold ($\gamma$) and optimal splitting range ($[U_{oc}, L_{oc}]$) for determining the classification criteria of
three days' data. (Panels: Day 1 Data, Day 2 Data, Day 5 Data.)


March 25, 2015 


DRAFT

Fig. 6. Performance comparison of (a) SVM, DT, NBC, LR and HMM with k = 55; (b) SVM, DT, NBC, HMM and trained
HMM with k = 192. (Vertical axis: Classification Accuracy.)
Fig. 7. (a) Performance comparison of ML algorithms: SVM, DT, NBC, HMM and 'SVM+FFA' using k = 192 for a set of 30
days. (b) Comparison of the 'expected probability of SU outage' with the SU outage evaluated using SVM, DT, NBC, HMM and
'SVM+FFA' using k = 192 for a set of 30 days. (Panel titles: Performance Comparison, Outage Probability; horizontal axis: Number of days.)